Claims

1. Process for the determination of interacting biomolecules, characterized in that
similar patterns of variation between two or more positions of at least two
biomolecules are used.

2. Process for the determination of interacting biomolecules,
characterized in that

a) a first group is provided comprising sequences representing homologous
biomolecules,

b) at least one second group is provided comprising sequences representing
homologous biomolecules,

c) group correlation values between the sequences of the first group and
the sequences of at least one second group are determined, and

d) the probability of the interaction of the sequence represented
biomolecules is determined on the basis of the group correlation values.

3. Process according to claim 2, characterized in that the probability of the interaction is
calculated as predicted interaction value. 

4. Process according to claim 2 or 3, characterized in that the interacting biomolecules
are those with a positive predicted interaction value.

5. Process according to any of claims 2 to 4, characterized in that any of the second
group(s) is converted into the first group and the first group is converted into a second
group and group correlation values between the sequences of this new first group and
the sequences of any of the second group(s), which also comprises the former first
group, are determined.

6. Process according to any of claims 2 to 5, characterized in that
site correlation values within each of the sequences within the first group and/or site
correlation values within each of the sequences within the second group(s) are
determined and said site correlation values are used for the calculation of the probability of
interaction and/or for the calculation of the predicted interaction value of the sequence
represented biomolecules.

7. Process according to claim 6, characterized in that the site correlation values are
correlation values for substitutions within the sequences.

8. Process according to any of claims 2 to 7, characterized in that

each sequence of each of said groups is fused to each other to form fused sequences
comprising at least one sequence of the first group and at least one sequence of any
second group(s),

the correlation values within these fused sequences are determined, and

the correlation values are used as group correlation values for determining the
predicted interaction value and/or the probability of interaction.
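
As a minimal sketch of the fusion step, assuming each group is given as a list of aligned, equal-length sequence strings, every first-group sequence can be concatenated with every second-group sequence. The names fuse_groups and cross_group_pairs are hypothetical and do not come from the claims. The second helper only lists the position pairs that span the two parts of a fused sequence, which is where group correlation values would be read.

```python
from itertools import product


def fuse_groups(first_group, second_group):
    """Concatenate every first-group sequence with every second-group sequence.

    Assumption: sequences inside each group are already aligned and share
    one length, so a given column means the same position in every fused row.
    """
    return [a + b for a, b in product(first_group, second_group)]


def cross_group_pairs(len_first, len_second):
    """Return position pairs (i, j) of a fused sequence where i lies in the
    first-group part and j lies in the second-group part."""
    return [(i, len_first + j)
            for i in range(len_first)
            for j in range(len_second)]


if __name__ == "__main__":
    fused = fuse_groups(["AC", "GT"], ["K", "R"])
    print(fused)
    print(cross_group_pairs(2, 1))
```

Correlation values computed for pairs within one part of a fused sequence would correspond to site correlation values, while the pairs returned by cross_group_pairs are the candidates for group correlation values.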

9. Process according to any of claims 2 to 8, characterized in that
correlation values are determined by

creating a position specific matrix containing the distances between pairs of sequences
at that position whereby the distances are calculated by applying a standard distances
matrix,

creating a combined matrix for two positions by calculating the covariation coefficient
between equivalent positions of their position specific matrices, and

determining the correlation value for a pair of positions by averaging the correlation
values of the combined matrix.

10. Process according to claim 9, characterized in that the standard distances matrix is the
scoring matrix by McLachlan. 
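
One possible reading of the three steps in claim 9 is a Pearson-type correlation between the position specific matrices of two alignment columns. The sketch below follows that reading under stated assumptions: toy_score is a hypothetical stand-in that scores identical residues as 1 and all others as 0 (it does not contain McLachlan values), and the function names are illustrative only.

```python
from itertools import combinations
from math import sqrt


def toy_score(a, b):
    # Hypothetical placeholder for a lookup in a standard distances matrix.
    # It is NOT the McLachlan matrix; swap in real values where available.
    return 1.0 if a == b else 0.0


def position_matrix(alignment, pos, score=toy_score):
    """Position specific matrix for one column, flattened to one entry
    per pair of sequences."""
    return [score(s[pos], t[pos]) for s, t in combinations(alignment, 2)]


def correlation(alignment, i, j, score=toy_score):
    """Correlation value for positions i and j of an alignment."""
    x = position_matrix(alignment, i, score)
    y = position_matrix(alignment, j, score)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((v - mx) ** 2 for v in x) / n)
    sy = sqrt(sum((v - my) ** 2 for v in y) / n)
    if sx == 0 or sy == 0:
        return 0.0  # a fully conserved column has no variation to correlate
    # Combined matrix: normalized products of equivalent entries.
    combined = [(a - mx) * (b - my) / (sx * sy) for a, b in zip(x, y)]
    # Correlation value: average over the combined matrix.
    return sum(combined) / n


if __name__ == "__main__":
    print(correlation(["AC", "AC", "GT", "GT"], 0, 1))
```

In this toy alignment both columns change together across the same sequence pairs, so the value is close to 1. A conserved column is assigned 0 by convention in this sketch, which is a design choice rather than something the claims specify.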

11. Method for the determination of interacting biomolecules which comprises processing
data of at least a first set of data and at least a second set of data to output data 

wherein each of the sets of data comprises independently and individually at least one
or more elements,

wherein each of the elements represents the sequence of a biomolecule,

wherein the elements of a single set of data represent a group of homologous
biomolecules,

wherein the output data comprises at least one pair of elements with one part of the
pair of elements comprising at least one element from the first set of data and the other
part of the pair of elements comprising at least one element from the second set of
data,

characterised in that 

a group correlation values data set is created comprising group correlation values
which are determined between the sequences of the first set of data and at least the
second set of data;

an interaction probability data set is created by retrieving group correlation
values from the group correlation values data set and determining the probability of
interaction of the biomolecules based on the group correlation values; and

at least some of the elements from the first and at least the second set of data which
have been used to create the group correlation values and the interaction probability
therefrom form the output data.

12. Method according to claim 11, characterized in that the probability of the interaction is
calculated as predicted interaction value. 

13. Method according to claim 11 or 12, characterized in that the elements, the predicted
interaction value of which is positive, are interacting biomolecules.

14. Method according to any of claims 11 to 13, characterized in that

any of the second set(s) of data is converted into the first set of data and the first set of
data is converted into a second set of data, and

group correlation values are determined between the sequences of this new first set of
data and the sequences of any of the second set(s).

15. Method according to any of claims 11 to 14, characterized in that

site correlation values within each of the sequences within the first set of data and/or
site correlation values within each of the sequences within the second set(s) of data are
determined, and 

said site correlation values form a set-specific site correlation value data set. 

16. Method according to claim 15, characterized in that the set-specific site correlation 
value data set is used to calculate the probability of interaction of and/or to calculate 
the predicted interaction value of the sequence represented biomolecules. 

17. Method according to claim 15 or 16, characterized in that the site correlation values
are correlation values for substitutions within the sequences. 

18. Method according to any of claims 11 to 17, characterized in that

a fused element set of data is generated by combining each element of the first set of
data individually with each element of any of the second set(s) of data, and

attributing each fused element individually to the fused element set of data.

19. Method according to claim 18, characterized in that 

the correlation values are determined within the various positions of a single element
of the fused element set of data, and

the correlation values are used as group correlation values for determining the
probability of the interaction of and/or predicted interaction value(s) of the biomolecules.

20. Method according to any one of claims 11 to 19, characterized in that the correlation 
values are determined by

creating a position specific matrix containing the distances between pairs of sequences
at that position whereby the distances are calculated by applying a standard distances
matrix,

creating a combined matrix for two positions by calculating the covariation coefficient
between equivalent positions of their position specific matrices, and

determining the correlation value for a pair of positions by averaging the correlation
values of the combined matrix.

21. Method according to claim 20, characterized in that the standard distances matrix is 
the scoring matrix by McLachlan.

22. Method according to any of claims 11 to 21, characterized in that the first set of data
and/or the second set(s) of data are retrieved from a medium which is selected
from the group comprising databanks, linked databanks, textual data and sets of data
generated by an analytical instrument.

23. Method according to any of claims 11 to 22, characterized in that the set(s) of data
comprise aligned sequences.

24. Method according to any of claims 11 to 23, characterized in that the output data are
output control characters for a target medium.

25. Method or process according to any of claims 2 to 24, characterized in that the
sequences of the first group or second group(s) or first set of data or second set(s) of data
are selected from the group comprising DNA sequences, RNA sequences and amino
acid sequences.

26. Method or process according to any of claims 2 to 25, characterized in that the number
of sequences comprised in any of the groups or any of the sets of data is at least ,
preferably at least 11.

27. Method or process according to any of claims 2 to 26, characterized in that the
sequences are homologous sequences.

28. Method or process according to claim 27, characterized in that the homologous
sequences stem from different origins.

29. Method or process according to claim 27, characterized in that the homologous
sequences in the first set of data and in the second set of data stem from the same origin
and/or the homologous sequences in the first group and in the second group stem from
the same origin.



30. Method or process according to any of claims 27 to 29, characterized in that the
homologous sequences are homologous genes.

31. Method or process according to claim 30, characterized in that the homologous genes
are orthologs.

32. Use of the method according to any of claims 11 to 31 for the simulation of
biomolecule interaction.

33. Use according to claim 32 wherein the interacting biomolecules are those with a posi- 
tive predicted interaction value determined by a process or method according to any of 
the preceding claims.

34. Pairs of interacting biomolecules determined according to a method or process
according to any of the claims 2 to 31.

35. Data structure readable by a computer, said data structure being generated by a process
or a method according to any of claims 2 to 31.

36. Computer readable medium for embodying or storing therein data readable by a com- 
puter, said medium comprising one or more of the following: 

a data structure generated by executing a process or a method according to any
of claims 2 to 31;

computer program code means which is adapted to cause a computer to
execute a process or method according to any one of claims 2 to 31.

37. Computer program product comprising the computer readable medium according to
claim 36. 

38. Database containing information on interacting sequence pairs generated by applying
a process or method according to any of the claims 2 to 31.

39. Database according to claim 38, wherein the database is an organism/species specific 
database. 



40. Computer system comprising an execution environment for running the process or
method according to any of the claims 2 to 31.

41. Device for simulating the interaction of biomolecules represented by their sequences
which comprises

a loading device for making available the sets of data according to any of the claims
11 to 31,

a processing device for performing the method according to any of the claims 11 to 31,

an output device for receiving the output data generated by the processing device.



